ABSTRACT



This study examines some of the literature on college 
faculty supply and demand and asks whether it is possible to adopt 
assumptions from the previous research to construct a complex model of 
faculty workforce using the available data. The study involved a 
comprehensive review of the literature; numerous interviews conducted by 
telephone, e-mail, and in person to discuss available datasets and various 
approaches to faculty supply and demand; analysis of 14 national datasets; 
and, finally, in-depth review of four datasets to assess their utility for 
modeling. The model developed had the following components: enrollment 
(undergraduate, masters, doctoral) broken out by gender and ethnicity; 
degrees (masters and doctoral); postdoctoral appointments; nonfaculty
research staff; faculty population (full-time, instructional, research, and 
public service) broken out by rank within tenure status by discipline and by 
tier of institution, and including retirement rates, quit rates, and 
mortality by discipline; faculty workload for full-time faculty; and research 
activity, including the need for post-doctoral students, nonfaculty research 
staff, and degree productivity. This study suggests that it is impossible to 
construct a complex model of faculty supply and demand with currently 
available data. The report concludes with recommendations for improved data 
collection. (Contains 31 references.) (CH) 
The Glut of Ph.D.s - Complex Models for the Faculty Workforce



Of the many topical interests for research on higher education, few are as critical 
to the policy arena or have as significant an impact on the infrastructure of colleges and 
universities as faculty supply and demand. This paper addresses two questions of 
interest: (1) What do the national datasets have to offer studies of faculty supply and 
demand? and (2) Is it possible to adopt the assumptions of previous research and 
construct a complex model of the faculty workforce using all available data? The results 
suggest that while existing data collection efforts allow for many types of complex
policy studies about faculty, it is impossible to construct a complex model of faculty 
supply and demand. 




I. Introduction 

Of the many topical interests for research on higher education, few are as critical to 
the policy arena or have as significant an impact on the infrastructure of colleges and univer- 
sities as faculty supply and demand. This research serves many purposes and involves many 
different approaches. At one end of the spectrum are the production and utilization projec- 
tions of Massy and Goldman (1995), the faculty prospects models of Bowen and Sosa 
(1989), and the study of graduate education by Bowen and Rudenstine (1992). These com- 
plex analyses are supplemented by the cumulative knowledge base of hundreds of descriptive 
studies conducted by individual scholars and by federal agencies such as the National Sci- 
ence Foundation (NSF) and the National Center for Education Statistics (NCES) to examine 
combinations of faculty characteristics such as age, tenure, rank, discipline, gender, citizen- 
ship, and ethnicity. 

The common thread to studies as disparate in purpose as calculating availability sta- 
tistics for affirmative action, examining faculty workload, analyzing doctoral unemployment, 
and predicting the effect of early retirement programs is their reliance on national data about 
the faculty population. On the surface, the potential for new research on faculty seems inex- 
haustible. Yet when the data behind the research are separated from the many assumptions 
and models which are put forward, it is clear that researchers have some strict limitations 
placed on their work. 

This paper attempts to clarify the types of research on faculty supply and demand
which are possible, given the inherent limitations of the national data, and to suggest the
types of assumptions and models which may be supported. This represents something of a
departure from the traditional approach to conducting research on faculty. Usually, the ex- 
isting literature is reviewed and the weight of previous research is used to document the va- 
lidity of a new model or set of assumptions. Scholars then look to the various national da- 
tasets or conduct a new data collection effort to test their hypotheses. For the purposes of 
this study, this order is reversed. The questions of interest are: 

(1) What do the national datasets have to offer studies of faculty supply and demand? 

(2) Is it possible to adopt the assumptions of previous research and construct a com- 
plex model of the faculty workforce using all available data? 

This proposed review of data elements, sampling techniques, and weighting issues in 
the national datasets has little utility if not first informed by the types of assumptions and 
models which occur in the literature. What are the basic studies of faculty characteristics 
which have been conducted by scholars and are most relevant to current policy concerns? 
What kinds of models are present in the literature? 

A review of the literature on faculty supply and demand suggests that there are at 
least 19 different types of assumptions or dimensions related to this topic of research. In
Table 1, these are arrayed by the major studies in which they appear. Table 2 arrays these
assumptions by those national data sources which collect relevant fields.

Each type of assumption will be discussed in terms of how it informs models of fac- 
ulty supply and demand, its use in the current research literature, and its availability in the 
national data. Specific focus will be given to the problems of using each assumption in
particular kinds of studies. It is hoped that out of this discussion will come insights into the
most appropriate and meaningful ways to use the national datasets. After the discussion of 
assumptions, a complex model of faculty supply and demand will be constructed based on 
what is learned from previous research and the utility of the datasets. 

II. Methodology 

Four related research efforts were undertaken as part of this study: 

(1) A comprehensive literature review was conducted with the bibliographic search 
tools ERIC, Education Index, and Higher Education Abstracts and the commercial search
engines of the World Wide Web. As a result, the studies by Massy and Goldman (1995),
Bowen and Sosa (1989), Bowen and Rudenstine (1992), and COSEPUP (1995) were examined
in depth, along with relevant published studies by the National Research Council (NRC), the 
NSF, and the Council of Graduate Schools (CGS). Literature review essays such as Geiger 
(1997) and Hartle and Galloway (1996) also helped inform this study. 

(2) Numerous interviews were conducted by phone, email, and in person to discuss 
the datasets and various approaches to faculty supply and demand. These included conver- 
sations with Ernie Benjamin, AAUP; Sam Bettinger, Pinkerton; Joan Burrelli, NSF; Law- 
rence Burton, NSF; Michael Cohen, NCES; Valerie Martin Conley, Virginia Tech; Mary 
Golladay, NSF; Theresa Grimes, Quantum Research Corporation (QRC); Linda Hardy, NSF; 
Susan Hill, NSF; Steve Hunt, U.S. Dept. of Education; Rolf Lehming, NSF; Linda Parker,
NSF; Carolyn Shettle, NSF; Peter Syverson, CGS; Veerle Van Meel, QRC; Jim Voytuk, Na- 
tional Academy of Sciences (NAS); and Linda Zembler, NCES. 




(3) Fourteen different national datasets on faculty are discussed in the literature. Each 
of these was reviewed to evaluate its utility for research on faculty supply and demand. This 
review included an examination of the survey sample, data elements, weighting procedures, 
methodology reports, and published and unpublished studies. In some cases, interviews were 
conducted with the agency staff in charge of each survey. 

(4) Finally, several of the national datasets were selected for more in-depth review 
and analysis to better evaluate their utility for modeling. These include the Survey of Earned 
Doctorates (SED), the Survey of Doctorate Recipients (SDR), the National Study of Post- 
secondary Faculty (NSOPF), and the NSF-NIH Graduate Student Survey (GSS). Microdata 
licenses were obtained from NSF for the SED and SDR. The CD-ROM version of the Public
Access Data Analysis System (DAS) was used for the NSOPF. The raw data for the GSS 
were obtained from the Quantum Research Corporation. Summary data for the SED and the 
GSS were also obtained using the online WebCaspar system and for the SDR using the on- 
line, public version of NSF's SESTAT system. 

A methodological log was maintained throughout the course of this research and 
various kinds of peer debriefings were held. The results of the review of the literature and 
the national datasets were presented in a paper at the 1997 Forum of the Association for
Institutional Research in Orlando (Milam, 1997). Numerous discussions with NCES and NSF
staff were held as a result of this paper and these helped guide the choice of datasets to re- 
view in depth. The author is grateful to NSF, NCES, and the Association for Institutional 
Research (AIR), which made this study possible through the funding of an NSF/NCES/AIR 
Research Fellowship. 

III. Assumptions and Dimensions of Faculty Supply and Demand



Table 1: Assumptions/Dimensions by Research Study

Assumption                    Massy &   Bowen    Bowen &     COSEPUP   NSF Issue  NRC   CGS
                              Goldman   & Sosa   Rudenstine            Briefs
UG enrollment/projections     Yes       -        -           -         -          -     -
MA enrollment/projections     Yes       -        -           Yes       -          -     -
DR enrollment/projections     Yes       Yes      -           Yes       -          -     -
Time to degree                Yes       -        Yes         Yes       -          -     -
Financial support             -         -        Yes         Yes       Yes        -     -
Degree productivity           Yes       Yes      Yes         Yes       Yes        -     -
Employment plans/rate         Yes       Yes      -           Yes       Yes        -     Yes
Ethnicity/citizenship/gender  Yes       -        Yes         Yes       Yes        -     -
Post-docs                     Yes       -        Yes         Yes       Yes        -     Yes
Faculty workload              Yes       Yes      -           -         -          -     -
Rank                          Yes       -        -           -         -          Yes   -
Tenure                        -         Yes      -           -         -          -     -
Quit Rates                    Yes       Yes      -           -         -          -     -
Retirement                    Yes       Yes      -           -         -          -     -
Mortality/Disability          Yes       Yes      -           -         -          -     -
Departmental behavior         Yes       -        Yes         -         -          Yes   -
Tier/sector structures        Yes       Yes      Yes         -         -          -     -
Research activity             Yes       -        -           -         -          Yes   -
Disciplines                   Yes       Yes      Yes         Yes       Yes        Yes   -





Undergraduate enrollment/projections 

In their model of departmental behavior, Massy and Goldman (1995) use various cross- 
sectional 1980 data to produce regression equations to predict departmental demand for the en- 
dogenous variables of graduate students, faculty, and post-doctoral fellows. One of their exoge- 
nous variables is undergraduate FTE enrollment by institution. While the number of bachelors 
degrees awarded by major is also an exogenous variable, the use of enrollment data is particu- 
larly interesting. This documents the common perception that the need for faculty and for 
graduate teaching assistants (GTAs) is driven in part by undergraduate enrollment and that re- 
search and doctoral-granting Carnegie institutions rely more heavily on GTAs to meet teaching 
needs. The U.S. News, College Board, and other admissions guide surveys use this as a measure
of quality in undergraduate education. 

Table 2: Assumptions/Dimensions by Source of Data

Assumption                    SED    SDR    NSOPF  GSS    NRC    S      SA     EF     C      F      CGS
UG enrollment/projections     -      -      -      -      -      -      -      Yes    -      -      -
MA enrollment/projections     -      -      -      -      -      -      -      Yes    -      -      Yes
DR enrollment/projections     -      -      -      -      -      -      -      Yes    -      -      Yes
Time to degree                Yes    Yes    Yes    -      -      -      -      -      -      -      -
Financial support             Yes    Yes    Yes    Yes    -      -      -      -      -      Yes    -
Degree productivity           Yes    -      -      -      -      -      -      -      Yes    -      -
Employment plans/rate         Yes    Yes    Yes    -      -      -      -      -      -      -      -
Ethnicity/citizenship/gender  Yes    Yes    Yes    Yes    -      Yes    Yes    Yes    Yes    -      -
Post-docs                     Yes    Yes    Yes    Yes    -      -      -      -      -      -      -
Faculty workload              -      -      Yes    -      -      -      -      -      -      -      -
Rank                          -      -      Yes    -      -      Yes    Yes    -      -      -      -
Tenure                        -      -      Yes    -      -      Yes    Yes    -      -      -      -
Quit Rates                    -      Yes    Yes    -      -      -      -      -      -      -      -
Retirement                    -      Yes    Yes    -      -      -      -      -      -      -      -
Mortality/Disability          -      Yes    Yes    -      -      -      -      -      -      -      -
Departmental behavior         -      -      -      -      -      -      -      -      -      -      -
Tier/sector structures        Yes    Yes    Yes    Yes    Yes    Yes    Yes    Yes    Yes    Yes    -
Research activity             Yes    Yes    Yes    Yes    Yes    -      -      -      -      Yes    -
Disciplines                   Yes    Yes    Yes    Yes    Yes    -      -      Yes    Yes    -      Yes



While undergraduate enrollments are discussed by the NRC's Committee on Science, Engineering,
and Public Policy (COSEPUP, 1995) in its study Reshaping the Graduate Education of Scientists
and Engineers, particularly in regard to changing demographics, none of the complex models of
faculty supply and demand utilize enrollment data to any large extent. Massy and Goldman do
not make use of the discipline-specific data by institution which are collected by the annual
NCES IPEDS survey of Fall enrollment (EF). This survey collects enrollment data by discipline
at the CIP code level and by student level, full/part-time student status, gender, and ethnicity.

Bowen and Sosa (1989) use enrollment as part of their calculation of faculty workload
predictions and their impact on supply and demand. While they start with enrollment projec- 
tions based on the IPEDS EF survey, they use data aggregated by institution. This is probably 
because they relied on data available to the public in the late 1980s and few researchers were 
using the CIP-specific data which were collected but not released. They write that "Data show- 
ing enrollments by field of study exist only at the level of the individual institution, and even 
then they are often incomplete or incompatible with data from other institutions" (p. 46). To 
obtain discipline-specific enrollments, they used the IPEDS Completions (C) survey and applied 
percentages of degrees conferred by clusters of disciplines to their enrollment projections. These 
were then used to calculate discipline-specific student-faculty ratios. 
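
The proxy calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Bowen and Sosa's actual procedure: the function names, the three discipline clusters, and all figures are hypothetical and are not drawn from IPEDS or any other survey.

```python
def allocate_enrollment(total_enrollment, degrees_by_cluster):
    """Split projected enrollment across clusters in proportion to degrees conferred."""
    total_degrees = sum(degrees_by_cluster.values())
    return {
        cluster: total_enrollment * degrees / total_degrees
        for cluster, degrees in degrees_by_cluster.items()
    }


def student_faculty_ratios(enrollment_by_cluster, faculty_by_cluster):
    """Divide the allocated enrollment by the faculty count for each cluster."""
    return {
        cluster: enrollment_by_cluster[cluster] / faculty_by_cluster[cluster]
        for cluster in enrollment_by_cluster
    }


# Hypothetical inputs for illustration only.
degrees = {"humanities": 200, "sciences": 300, "professional": 500}
faculty = {"humanities": 100, "sciences": 200, "professional": 250}

enrollment = allocate_enrollment(10_000, degrees)
ratios = student_faculty_ratios(enrollment, faculty)
print(enrollment)
print(ratios)
```

Because the allocation follows degrees conferred, any cluster whose courses serve many non-majors receives too small a share of enrollment, which is the limitation discussed in the next paragraph.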

There are some problems in using degree data as a proxy variable for enrollment. Bowen 
and Sosa (1989) recognize that "students who go on to receive one kind of degree can cross- 
register in courses taught by faculty members who are in other fields of study," so that "we al- 
most certainly underestimate shares of enrollment in the arts and sciences when we look only at 
degrees conferred" (p. 46). A similar argument may be made about the problem of using enroll- 
ment by major. An Induced Course Load Matrix (ICLM) model shows the relationships of de- 
partmental consumption and contribution. Many majors take courses outside of their major and 
many departments serve non-majors in their courses. This presents a problem with using en- 
rollment data by major for documenting faculty workload. 
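
A small sketch may make the consumption and contribution idea concrete. It assumes an ICLM stored as student credit hours indexed first by student major and then by teaching department; the three departments and every number are hypothetical, as are the function names.

```python
# Hypothetical ICLM: iclm[major][department] = student credit hours (SCH)
# taken by students in `major` from courses taught by `department`.
iclm = {
    "history": {"history": 600, "math": 100, "biology": 50},
    "math": {"history": 150, "math": 500, "biology": 100},
    "biology": {"history": 100, "math": 300, "biology": 700},
}


def sch_taught(matrix, department):
    """Contribution: all credit hours the department teaches, to any major."""
    return sum(row.get(department, 0) for row in matrix.values())


def sch_taken(matrix, major):
    """Consumption: all credit hours the major's students take, from any department."""
    return sum(matrix[major].values())


def service_share(matrix, department):
    """Share of a department's teaching that goes to students outside its own major."""
    taught = sch_taught(matrix, department)
    own_majors = matrix[department].get(department, 0)
    return (taught - own_majors) / taught


print(sch_taught(iclm, "math"), sch_taken(iclm, "math"))
print(round(service_share(iclm, "math"), 3))
```

In this made-up example the mathematics department teaches more credit hours than its own majors consume, so a workload measure based on enrollment by major alone would understate its teaching load.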

Clearly these studies recognize the importance and also some of the problems of using 
enrollment data to predict faculty demand. However, none make use of the discipline-specific
enrollment data which are available. The IPEDS EF data files available on the NCES
web site and on WebCaspar are at the level of the first two digits of the CIP code. The IPEDS
CD-ROM provides access to the full six-digit CIP code data. It is also possible for researchers to
request special cross-tabulations of these data from the National Data Resource Center (NDRC). 

Future researchers using the methodology of Massy and Goldman (1995) and Bowen 
and Sosa (1989) now have discipline-specific enrollment data readily available. These data are
critical to calculating faculty workload and to projecting faculty demand based on workload. 
Both studies use student full-time equivalencies (FTE) instead of headcount, using the standard 
NCES calculation which equates 1 full-time headcount to 1 FTE and 3 part-time headcount to 1 
FTE. While the IPEDS Institutional Characteristics Survey (IC) has in the past included ques- 
tions about student credit hours (SCH) by level, these are not at the discipline level and these 
data are not being reported in the raw data files, in part because of recognized problems in insti- 
tutional reporting methods. Without SCH data, researchers must rely on the NCES calculation. 
The results of this calculation are suspect for those institutions which have significant part-time 
enrollments such as urban institutions and community colleges. 
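
The FTE rule described above (one full-time student counts as 1 FTE, three part-time students count as 1 FTE) is simple enough to state as code. The sketch below is illustrative only: the function name, the two institutions, and the alternative assumption that part-time students carry a half load are hypothetical.

```python
def student_fte(full_time, part_time, part_time_per_fte=3):
    """Headcount to FTE: each full-time student is 1 FTE; by default 3 part-time students are 1 FTE."""
    return full_time + part_time / part_time_per_fte


# Two hypothetical institutions with very different enrollment mixes.
residential = student_fte(full_time=900, part_time=300)
commuter = student_fte(full_time=200, part_time=2400)
print(residential, commuter)

# If part-time students actually carried a half load (a hypothetical assumption),
# the same headcounts would yield quite different FTE totals.
residential_half = student_fte(900, 300, part_time_per_fte=2)
commuter_half = student_fte(200, 2400, part_time_per_fte=2)
print(residential_half, commuter_half)
```

Under the standard ratio the two institutions look identical, while under the alternative ratio the part-time-heavy institution is far larger, which illustrates why the calculation is suspect where part-time enrollment is significant.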

The benefit of using the IPEDS EF Survey is that it collects data on the entire student
population, not just a sample. While some institutions do not respond, these are usually proprie- 
tary and technical schools and the previous year's data may be substituted. When assumptions 
about time-to-degree and graduation rates are applied to cohorts of students, it is possible to pre- 
dict the enrollment component of supply by discipline. 

The "BA-PhD Nexus" is described by Bowen and Rudenstine (1992) as a problem in 
tracking Ph.D. cohorts. If Ph.D. recipients of a given year are used for analysis, then the data on 
year of B.A. vary widely. For this reason, the authors build cohorts of Ph.D. recipients based
on the year they receive their B.A., not the year of the doctorate, allowing for "more precise 
matching of numbers of doctoral recipients with conditions that prevailed at the time most of 
them began graduate study" (p. 42). Using time-to-degree calculations based on B.A. cohorts
and graduation rates, undergraduate enrollment may be used to project doctoral enrollment. 
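
One way such a projection might be set up is sketched below. It is not the method of Bowen and Rudenstine; it simply assumes that a fixed share of each B.A. cohort eventually earns a doctorate and that completions are spread over a distribution of elapsed years. The cohort sizes, the 5 percent rate, and the lag distribution are all hypothetical.

```python
def project_doctorates(ba_cohorts, phd_rate, lag_shares):
    """Project doctorates by year from B.A. cohorts.

    ba_cohorts: {ba_year: number of B.A. recipients}
    phd_rate:   assumed share of each B.A. cohort that eventually earns a doctorate
    lag_shares: {years from B.A. to Ph.D.: share of eventual doctorates finishing at that lag}
    """
    projected = {}
    for ba_year, ba_count in ba_cohorts.items():
        for lag_years, share in lag_shares.items():
            phd_year = ba_year + lag_years
            projected[phd_year] = projected.get(phd_year, 0.0) + ba_count * phd_rate * share
    return dict(sorted(projected.items()))


# Hypothetical inputs for illustration only.
ba_cohorts = {1985: 1000, 1986: 1200}
lag_shares = {6: 0.25, 8: 0.50, 10: 0.25}

projection = project_doctorates(ba_cohorts, phd_rate=0.05, lag_shares=lag_shares)
for year, count in projection.items():
    print(year, round(count, 1))
```

In practice the rate and the lag distribution would have to be estimated separately by discipline, for example from the completion dates recorded in the SED.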

Master's enrollment/projections 

In Massy and Goldman's (1995) departmental behavior model, one of the exogenous 
variables is masters degrees by major. As in the use of undergraduate enrollment data, these data 
are simply one variable in a complex regression equation, recognizing that "Faculty size depends 
on general undergraduate degrees, in-major degrees, masters degrees, and doctoral degrees be- 
cause of instructional needs" (p. 3-5). While masters degrees are included with total degrees in 
Bowen and Sosa's (1989) calculation of faculty workload and applied to overall enrollment
ratios, no special significance is given to masters enrollment data.

Masters students are one aspect of departmental activity considered by Bowen and
Rudenstine. Still, most of these complex models do not incorporate masters enrollment in any
significant way in predicting faculty supply and demand. However, meaningful statistics may 
arise from calculating student-level faculty workload. With this method, assumptions could be
made about the pipeline of graduate students. The NSOPF, SDR, and SED surveys collect data 
on graduation dates by degree for each respondent and could be used to calculate time to the BA, 
MA, and doctorate. While time-to-degree and cohort tracking methodologies are quite compli- 
cated, these survey data have not been used to their fullest extent. 

Degrees conferred and enrollment data for masters programs are also useful in projecting 
potential community college faculty, who traditionally do not need the doctorate. Most faculty
demand models do not adequately capture community college needs. Massy and Goldman 
(1995) only address research and doctoral institutions and Bowen and Sosa (1989) only predict 
results for four-year institutions and above. 



In addition to the SDR, the NSF SESTAT system includes the National Survey of Recent 
College Graduates (NSRCG) and the National Survey of College Graduates (NSCG). All three 
SESTAT surveys include occupation codes for postsecondary faculty in 29 clusters of disci- 
plines. These provide some estimates of the faculty population weighted to census estimates. 
With models based on the NSRCG, masters enrollment data could be used to project faculty 
supply and demand for those with masters degrees, similar to the ways in which the NSOPF, 
SDR, and SED could be used to project supply and demand for those who will earn doctorates. 

This discussion suggests that masters enrollment and degree data are useful but little used 
components of complex faculty supply and demand models. While Bowen and Rudenstine pay 
great attention to the "BA-PhD Nexus," a similar potential relationship for predicting enrollment 
exists with MA-Ph.D. cohort enrollment tracking. Once established, there are other sources for 
data besides the IPEDS EF and IPEDS C reports which can be used to project enrollment and 
degree variables for supply and demand. Perhaps the most important and consistent of these is 
the annual CGS-GRE Graduate Survey, which collects full- and part-time data by discipline/ 
program by institution. The GSS is similar and includes relevant data on financial support, but 
only collects data on science and engineering disciplines. 

Doctoral enrollment/projections 

Significant efforts have been made in predicting discipline-specific doctoral enrollment 
using data from the IPEDS EF, CGS-GRE, and GSS surveys. Bowen and Sosa (1989) include 
doctoral enrollments in their calculation of overall faculty workload and use these ratios in their 
projections of demand. Massy and Goldman (1995) include headcount doctoral enrollment as 
one of their three endogenous variables in their departmental behavior model. Yet it is doctoral 
degree productivity trends, not enrollment projections, which are at the heart of these two
research models. There is much still to be learned from enrollment data if assumptions about
time-to-degree and graduation rates by cohort tracking years are incorporated.

While only the IPEDS EF data are available for undergraduate enrollment by discipline, 
the CGS-GRE survey and the GSS between them provide detailed data on gender, ethnicity, full- 
and part-time status, and funding. While the GSS is limited to science and engineering
disciplines, this sample includes social science fields such as psychology. The CGS-GRE data are not
readily available to the public in electronic format. They are published in annual studies by CGS 
and are the subject of much discussion among graduate deans. Both surveys are collected by 
program, with breakouts for masters and doctoral programs within field specialties. The GSS is 
more detailed in its disciplinary breakout, in part because of its collection of postdoctoral data on 
medical residency specialties. 

In conjunction with undergraduate and masters enrollment, doctoral enrollment projections
are a primary feeder for predicting degree productivity. Again, these data are underutilized
in faculty supply and demand models, in part because previous researchers did not have access to them at
the appropriate disciplinary level and in part because of the need to understand data administra- 
tion issues surrounding the best way to use each survey. 

Time to degree 

Attainment rates and time-to-degree are combined as "propensity to graduate" and "pro- 
pensity to drop out" by Massy and Goldman (1995). The authors build an equation that factors 
in phases in which students gestate, progress, sustain, erode, and stagnate. These issues are
equally of interest to Bowen and Rudenstine (1992), who also examine issues of cost and student
financial support. 

Time-to-degree is a factor mentioned above in relationship to using enrollment and
graduation rates to predict degree productivity. While this is examined fully by Bowen and
Rudenstine (1992) in their discussion of the "BA-PhD Nexus," differentiated enrollment,
graduation, and time-to-degree rates have not been applied to cohorts of masters students. This
is particularly needed if community college needs are to be predicted.

Data on time-to-degree are available in the SED, the SDR, and the NSOPF by investi- 
gating completion dates in individual survey responses. The SDR is a sample of faculty and 
non-faculty, while the SED is collected from the entire population of research doctorate recipi- 
ents. Using the SED, researchers can study cohorts of recipients based on graduation year and 
discipline of BA and MA. 
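
As a rough illustration of how time-to-degree might be derived from completion dates, the sketch below groups records by B.A. year and takes the median elapsed time to the doctorate. The record layout and field names (`ba_year`, `phd_year`) are hypothetical and do not reflect the actual SED, SDR, or NSOPF file structures; the records themselves are invented.

```python
from collections import defaultdict
from statistics import median


def median_time_to_degree(records):
    """Median years from B.A. to doctorate, grouped by B.A. cohort year."""
    elapsed_by_cohort = defaultdict(list)
    for record in records:
        elapsed = record["phd_year"] - record["ba_year"]
        elapsed_by_cohort[record["ba_year"]].append(elapsed)
    return {cohort: median(values) for cohort, values in sorted(elapsed_by_cohort.items())}


# Invented records for illustration only.
records = [
    {"ba_year": 1980, "phd_year": 1987},
    {"ba_year": 1980, "phd_year": 1989},
    {"ba_year": 1980, "phd_year": 1991},
    {"ba_year": 1982, "phd_year": 1988},
    {"ba_year": 1982, "phd_year": 1992},
]

print(median_time_to_degree(records))
```

The same grouping could be extended with a discipline key so that cohorts are defined by both year and field of the B.A. or M.A.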

Financial support 

Bowen and Rudenstine (1992) recognize the impact of financial aid and research funding 
support on graduate education. This factor does not appear, however, in any of the complex 
models of faculty supply and demand. While much financial aid research has been conducted on 
the factors affecting student attrition, retention, and graduation, this is usually focused on the un- 
dergraduate degree. The data exist, though, for sophisticated and useful work to be done on the 
impact of financial support on predicting doctoral degree productivity and therefore faculty sup- 
ply. 

The GSS includes funding types by each federal agency, self-support, and other forms of 
aid (NSF, 1995). These data are reported in the NSF Institutional Profile series for researchers to 
examine trends by institution and by field of study. It is possible, using the GSS and the SED, to 
investigate departmental behavior at an individual institution and to aggregate these data up to 
tiers of Carnegie/control to make assumptions about the impact of aid on graduation. A robust
model of faculty supply needs to account for variations in graduate student funding patterns in
predicting doctoral degree productivity.



Degree productivity 

The SED and IPEDS C data represent the entire population of degree recipients at the 
doctoral level. The IPEDS C is available in most sources at the two-digit CIP code level, while the
SED has a different but equally complex disciplinary taxonomy. A crosswalk between the two 
taxonomies is available as part of WebCaspar. The SED only includes research doctorates, while 
the C includes all doctorates. This distinction is troubling when comparing data from the two 
surveys, and it is important to reconcile any assumptions about the SED population by discipline 
with data from the C. For example, data on the percentage by discipline who plan to enter
academe could be applied to the discipline data from the C to obtain a higher number of
potential faculty. If only the SED data are used, the number of potential faculty could be
under-reported. The SED data allow for breakouts by type of doctoral degree, something not collected
with the C. Assumptions about type of degree could be incorporated along with post-doctoral 
plans in better understanding and predicting faculty career paths. One problem with comparing 
the two reports is the slightly different survey year. 
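
The reconciliation suggested above, applying SED-based percentages to the larger IPEDS C counts, might look like the following sketch. All counts and shares are hypothetical, the function name is an assumption, and the sketch presumes the two disciplinary taxonomies have already been crosswalked to common labels.

```python
def potential_faculty(doctorates_by_discipline, academe_share_by_discipline):
    """Apply the share planning to enter academe to doctorate counts, discipline by discipline."""
    return {
        discipline: count * academe_share_by_discipline[discipline]
        for discipline, count in doctorates_by_discipline.items()
        if discipline in academe_share_by_discipline
    }


# Hypothetical counts: research doctorates only (SED-like) versus all doctorates (C-like).
sed_counts = {"education": 100, "engineering": 200}
c_counts = {"education": 120, "engineering": 210}

# Hypothetical shares of recipients planning to enter academe, as estimated from SED responses.
academe_share = {"education": 0.40, "engineering": 0.10}

from_sed = potential_faculty(sed_counts, academe_share)
from_c = potential_faculty(c_counts, academe_share)
print(from_sed)
print(from_c)
```

The sketch assumes that holders of non-research doctorates plan to enter academe at the same rate as research doctorate recipients, which is itself an assumption that would need to be examined.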

One problem in using the SED is that the data continue to be collected after the reporting 
year is over. This increases the response rate but means that results obtained with the most
recent version of the data may differ from those in published versions. The IPEDS C is completed
by each institution based on official census data, while the SED is completed by the doctoral recipient, usually
as a requirement of graduation. 

Data on many characteristics of the doctoral degree population are available from the 
SED, including such fields as dissertation topic, family educational history, post-doc status, and 
employment plans. In the interviews conducted with NSF staff as part of this study, some
dissatisfaction was expressed about the disciplinary taxonomy of the SED, particularly in
relationship to the taxonomy of the SDR. The changing structure of the disciplines is difficult to map
and the SED has been criticized for failing to adequately document the changing nature of the 
disciplines. A similar methodological issue surrounds the NSOPF, which failed to collect ade- 
quate responses from health science faculty. 

The SED is the basis of the NRC's Doctorate Records File and is used to create the biennial
sample for the SDR. The SDR data elements include all fields available in the SED and allow
for interesting comparisons, such as whether students who plan to enter academe
or a post-doc actually do so. The SDR is also used extensively for the calculation of unemploy- 
ment rates. 

A number of publications use the SED data, including agency reports from NSF and 
NCES. These depict trends in degree productivity by discipline and are the most visible type of 
research on disciplinary behavior and its relationship to faculty supply and demand. 

Data from the SED are published annually by the NRC and are used by affirmative action 
and equal opportunity officers (AA/EEO) to calculate faculty availability statistics. In the Na- 
tional Study of Faculty Availability and Utilization (NSFAU), the author documents that NRC 
publications on the SED are the primary source of faculty availability data by discipline used for 
AA/EEO statistics (Milam, 1995a, 1995b). 

Complex faculty availability models produced at the University of Washington, the Uni- 
versity of Colorado, and elsewhere weight the SED data by degree year to estimate gender and 
ethnicity percentages for each faculty rank. In these models, the current year's data are used for 
estimating the gender and ethnicity availability for new assistant professors. For associate and 
full professor hires, AA/EEO officers sometimes combine and weight different years of data,
reflecting different assumptions about time from degree and rank transitions.
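
The weighting of several degree years can be illustrated with a simple weighted average. This is a sketch under stated assumptions, not the procedure of any particular institution: the source does not describe the weights actually used, so the years, shares, and weights below are hypothetical.

```python
def weighted_availability(share_by_year, weight_by_year):
    """Weighted average of a group's share of doctorates across degree years."""
    total_weight = sum(weight_by_year.values())
    weighted_sum = sum(share_by_year[year] * weight for year, weight in weight_by_year.items())
    return weighted_sum / total_weight


# Hypothetical share of doctorates awarded to women in one discipline, by degree year.
women_share = {1985: 0.30, 1986: 0.32, 1987: 0.34}

# Hypothetical weights expressing how likely each degree cohort is to supply
# candidates for associate professor hires.
associate_weights = {1985: 1, 1986: 2, 1987: 1}

availability = weighted_availability(women_share, associate_weights)
print(round(availability, 3))
```

For new assistant professor hires the same function reduces to the current year's share when only that year is given a nonzero weight.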

Sixty-six (52.0%) of the 127 doctoral-granting institutions which participated in the 
NSFAU reported that they rely on current year NRC data to complete the eight-factor analyses
which are required by the OFCCP and the EEOC. Thirty-eight institutions (29.9%) use trend 
data to aggregate across SED survey years. These models inform those of faculty supply and 
demand, particularly in the way they incorporate SED data for estimating faculty availability by 
rank. 

Employment plans/rate 

Only the SED provides data on whether doctoral recipients intend to find work in aca- 
deme. The AA/EEO models discussed above are somewhat flawed because they are not based 
on the percentage of doctoral recipients who wish to enter academe. This statistic varies widely
by discipline. In the past, these data were often unpublished and unavailable to researchers 
without a microdata license. The newest online version of WebCaspar now includes these data, 
aggregated to the appropriate CASPAR disciplinary taxonomy. Table 3 provides these data for 
clusters of disciplines. Additional products will be prepared from this research to provide 
AA/EEO officers with detailed trends over time of the percent of women and minorities who 
wish to enter academe. 

COSEPUP (1995) and NRC (1995) report on these data, explaining that "More New 
Ph.D.s Have Uncertain Employment Plans." These studies also show that there is growing reli- 
ance on post-doctoral appointments, perhaps because of increased difficulty in obtaining tradi- 
tional, academic appointments. 

The number and percentage of new recipients seeking or with definite positions in academe
serve as the basic component of faculty supply estimates. As part of this study, the author
obtained the SED microdata for 1993 and calculated this statistic by discipline. The results were 
verified against those data produced with WebCaspar (which were not available at the time of 
the site license request for microdata). 



Table 3: 1993 Doctoral Recipients Entering Academe

                                           Total Number        Doctorates w/            Percent
                                           of Doctorate  Definite or Seeking            Seeking
Academic Discipline                             Degrees    Post-Sec/Med Appt  Post-Sec/Med Appt

TOTAL OF ALL ACADEMIC DISCIPLINES                39,801               11,438              28.7%
+ S&E TOTAL (INCL MEDICAL/OTH LIFE SCI)          26,640                4,973              18.7%
+ S&E TOTAL (EXCL MEDICAL/OTH LIFE SCI)          25,443                4,514              17.7%
+ ENGINEERING                                     5,698                  710              12.5%
+ PHYSICAL SCIENCES                               3,699                  233               6.3%
+ GEOSCIENCES                                       771                  110              14.3%
+ MATH AND COMPUTER SCIENCES                      2,026                  682              33.7%
+ LIFE SCIENCES                                   7,257                  987              13.6%
PSYCHOLOGY                                        3,420                  737              21.5%
+ SOCIAL SCIENCES                                 3,769                1,514              40.2%
+ HUMANITIES                                      2,973                1,971              66.3%
RELIGION AND THEOLOGY                               500                  206              41.2%
ARTS AND MUSIC                                      862                  485              56.3%
ARCHITECTURE AND ENVIRONMENTAL DESIGN                54                   20              37.0%
+ EDUCATION                                       6,689                2,601              38.9%
BUSINESS AND MANAGEMENT                           1,282                  785              61.2%
COMMUNICATION AND LIBRARIANSHIP                     391                  228              58.3%
LAW                                                  29                    4              13.8%
SOCIAL SERVICE PROFESSIONS                          237                   94              39.7%
VOCATIONAL STUDIES AND HOME ECONOMICS                57                   32              56.1%
OTHER NON-SCIENCES OR UNKNOWN DISCIPLINES            87                   39              44.8%

With multiple survey years, it is possible to track the perception of opportunity in aca- 
deme. However, respondents only report their desire to enter academe. It is necessary at some 
point to qualify the results by investigating with the SDR and its base file, the Doctorate Records 
File (which is taken from the SED), whether doctoral recipients follow through with their intention
to enter academe.
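
The percentage column in Table 3 is the ratio of the two count columns. A minimal Python sketch of that calculation, using three rows copied from Table 3, is given here for readers who wish to repeat it for other survey years. The function and variable names are illustrative only and are not part of any SED or WebCaspar product.

```python
# Rows copied from Table 3: (total doctorates, definite or seeking academe).
TABLE_3_ROWS = {
    "ALL ACADEMIC DISCIPLINES": (39_801, 11_438),
    "PHYSICAL SCIENCES": (3_699, 233),
    "HUMANITIES": (2_973, 1_971),
}


def percent_seeking_academe(total, seeking):
    """Percent of doctoral recipients with definite plans or seeking academe."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * seeking / total, 1)


# Recompute the percentage column for the selected rows.
percents = {name: percent_seeking_academe(*counts)
            for name, counts in TABLE_3_ROWS.items()}
```

Applied to several survey years, the same ratio would give the trend in the perception of opportunity described above. It measures stated intention only, not actual entry into academe.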

Ethnicity/citizenship/gender

Ethnicity and gender data from the SED are critical to calculations of AA/EEO faculty 
availability. These and other demographic data are also collected with the samples of the SDR 
and the NSOPF and in the population surveys of the IPEDS S and SA, the GSS, and the 
CGS/GRE. All models of faculty supply and demand may be qualified by ethnicity and gender, 
but few are. COSEPUP (1995) incorporates gender and ethnicity in its statements about trend 
data. Various NRC, NSF, and NCES reports document demographic characteristics. However, 
no attention is given by Massy and Goldman (1995) or Bowen and Sosa (1989) to these vari- 
ables. Bowen and Rudenstine (1992) briefly discuss trends in doctoral recipients by race and 
ethnicity, but give somewhat more focus to issues facing women graduate students. 

Demographic breakouts of the population surveys of the IPEDS EF, the GSS, and the 
CGS/GRE provide data on gender and ethnicity trends and patterns by tier and type of institu- 
tion. These are sufficient for the enrollment by discipline component of faculty supply and de- 
mand models. The IPEDS SA provides gender by rank within tenure status for the entire popu- 
lation of full-time, instructional faculty and the IPEDS S provides gender within ethnicity by 
rank within tenure status for the entire population of full-time, instructional, research, and public 
service faculty. Since neither IPEDS survey is by discipline, they are of less utility in supply and 
demand models. When the sample SDR and NSOPF surveys of faculty are analyzed, problems 
arise in preparing cross-tabs by demographic variables, due to the small cell sizes involved. 

In the paper "Developing Benchmarks for Faculty Hiring" (Milam, 1997), this author as- 
sesses whether the national datasets may be used to construct the critical cross-tab of interest for 
AA/EEO studies. This cross-tab includes gender within ethnicity for the columns and rank
within tenure for the rows, for each discipline at each institution. The results of this analysis 
show that it is impossible to construct this cross-tab, even aggregated by combinations of Carne- 
gie classification and control. While the SDR and NSOPF may be used to estimate gender and 
ethnicity data, weighted perhaps by the IPEDS S, the cell sizes are still inadequate. Some variable
has to be dropped in the estimation of the faculty population in order to increase cell sizes.
Unfortunately, ethnicity is often the first to go, followed by gender.

One variable which must be retained for purposes of faculty supply and demand is
citizenship. Non-resident alien doctoral students who hold temporary visas and are returning to
foreign institutions should be excluded from consideration as potential faculty members. In
documenting the WebCaspar reports on postdoctoral plans, it was discovered that the SAS program
used by QRC to report on the percentage of students with plans to enter academe includes some 
non-resident aliens. In reporting of ethnicity, it is necessary to use the citizenship field carefully 
so as not to over-report the number of potential faculty. 
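
The effect of the citizenship field on the count of potential faculty can be illustrated with a short Python sketch. The records, field names, and citizenship categories below are hypothetical and do not correspond to actual SED variable names or codes. The set of eligible categories is an assumption made for the example.

```python
# Hypothetical respondent records; field names are illustrative only.
respondents = [
    {"id": 1, "citizenship": "us_citizen", "seeks_academe": True},
    {"id": 2, "citizenship": "permanent_resident", "seeks_academe": True},
    {"id": 3, "citizenship": "temporary_visa", "seeks_academe": True},
    {"id": 4, "citizenship": "us_citizen", "seeks_academe": False},
]

# Assumption: only these categories are counted as potential faculty.
ELIGIBLE = {"us_citizen", "permanent_resident"}


def potential_faculty(records):
    """Keep respondents who seek academe and pass the citizenship screen."""
    return [r for r in records
            if r["seeks_academe"] and r["citizenship"] in ELIGIBLE]


# Count with and without the citizenship screen.
unfiltered = sum(1 for r in respondents if r["seeks_academe"])
filtered = len(potential_faculty(respondents))
```

In this invented example the unscreened count is three and the screened count is two, which is the kind of over-reporting the citizenship field is meant to prevent.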

For those AA/EEO officers who wish to calculate faculty availability within the critical 
cross-tab of interest, it is possible to construct models of the population by gender and ethnicity 
by rank and tenure using the IPEDS S, then weight these data by Carnegie and control to the dis- 
ciplinary breakouts which are possible of the SDR and NSOPF surveys. 
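
One possible reading of this weighting step is a proportional allocation: the population count for a Carnegie/control tier is spread across disciplines according to the shares observed in the sample. The Python sketch below illustrates that reading only. The counts are invented, the function name is hypothetical, and the approach is an assumption rather than a documented NCES or NSF procedure.

```python
def allocate_by_sample_share(population_total, sample_counts):
    """Spread a population count across disciplines using sample shares."""
    sample_total = sum(sample_counts.values())
    if sample_total == 0:
        raise ValueError("sample has no respondents in this tier")
    return {discipline: population_total * n / sample_total
            for discipline, n in sample_counts.items()}


# Hypothetical tier: 2,000 full-time faculty in the population survey,
# 100 sample respondents distributed across three discipline clusters.
estimates = allocate_by_sample_share(
    2_000,
    {"humanities": 30, "social sciences": 50, "engineering": 20},
)
```

The estimates inherit whatever sampling error is present in the discipline shares, so small sample cells produce unstable population figures.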

Post-docs 

The GSS, SED, and SDR each collect data on post-doctoral appointments. The GSS 
documents post-doctoral enrollment by discipline, gender, and ethnicity by institution. The SED 
collects individual responses from those who expect to enter post-docs, while the SDR sample 
survey includes data from the SED about post-doctoral intention and also provides verification
with current data about whether the respondent actually obtained a post-doctoral appointment. 

Unfortunately, the GSS only collects data on science and engineering post-docs. The 
SDR, in contrast, has two components - science and engineering and humanities. Using the SDR 
for data on post-docs is problematic, though. The cell sizes are small at best at the discipline 
level. The results of the GSS differ too from those obtained when using the SDR to estimate the 
number of post-docs, for the total and by discipline. 

On cursory reading, it appears that the NSOPF collects data on the entire faculty career. 
However, a review of the questionnaire and data elements shows that no data are collected about 
post-doctoral appointments, only about appointments such as teaching assistantships held in 
graduate school. 

Given these constraints, a complex model of faculty supply and demand needs to incor- 
porate data on post-doctoral appointments in several different ways. The number of SED re- 
spondents with definite plans or seeking post-docs must be taken into account as a factor which 
reduces the number of potential faculty members. Post-docs also must be considered as com- 
peting with recent doctoral recipients for academic appointments. To predict this movement, 
researchers need to make assumptions about the average length of post-doc appointments. 

Massy and Goldman (1995) use the estimate of one year. 

According to NSF staff, the preliminary results from a current NSF research study about
post-docs suggest that the average is much longer, perhaps as long as three years for some
fields. This obviously confounds the modeling of the faculty pipeline. Assumptions need to be 
made about each discipline and the career paths of new faculty, particularly about the growing 
use of post-doctoral appointments. 
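
The sensitivity of the pipeline to the assumed length of post-doctoral appointments can be shown with a simple lag model. The Python sketch below is not the Massy and Goldman model. It assumes a fixed appointment length in whole years and assumes that every post-doc seeks a faculty job on completion. All counts are invented.

```python
def academic_job_seekers(direct_seekers, postdoc_entrants, postdoc_years):
    """Yearly number of candidates reaching the faculty job market.

    direct_seekers[t]   : new doctorates seeking academic posts in year t
    postdoc_entrants[t] : new doctorates entering post-docs in year t
    postdoc_years       : assumed fixed post-doc length in whole years
    """
    if postdoc_years < 0:
        raise ValueError("postdoc_years must be non-negative")
    seekers = []
    for t, direct in enumerate(direct_seekers):
        # Post-docs who entered postdoc_years ago finish in year t.
        finishing = postdoc_entrants[t - postdoc_years] if t >= postdoc_years else 0
        seekers.append(direct + finishing)
    return seekers


direct = [100, 100, 100, 100, 100]   # hypothetical counts
postdocs = [40, 50, 60, 70, 80]      # hypothetical counts
one_year = academic_job_seekers(direct, postdocs, 1)
three_year = academic_job_seekers(direct, postdocs, 3)
```

Moving from a one-year to a three-year assumption delays the arrival of former post-docs on the market by two years, which changes the projected supply in every year of the simulation.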
Massy and Goldman (1995) use post-docs in their equations for predicting departmental
behavior, suggesting a plausible relationship between the number of post-docs and research
funding. Certainly, centers of research at individual institutions and patterns of research funding
by discipline have an impact on the training and marketability of post-docs. Neither Bowen and
Sosa (1989) nor Bowen and Rudenstine (1992) incorporate post-docs in their analysis of faculty
supply models.

Faculty workload 

As stated in the discussion of enrollment, Massy and Goldman (1995) incorporate overall
institutional enrollment and the number of majors in their regression equations, and Bowen and
Sosa (1989) include workload in their projections of demand. Bowen and Sosa take a much simpler
approach, calculating ratios based on student FTE. Workload must be seen along
with enrollment as a critical indicator of demand.

Despite problems in the NCES definition of FTE, this is the only student measure of
workload worth considering. To be consistent, faculty FTE must be based on the IPEDS S and
SA definitions of full-time faculty. Data on workload by faculty discipline are not collected
except in the NSOPF survey, which has a different definition of faculty from that used by the
IPEDS S and SA and the SDR.

The same problems of documenting the faculty population by discipline outlined above 
apply to workload. However, average workload in terms of number of courses taught is a much 
less problematic measure in the NSOPF data. Assumptions may be made based on the NSOPF 
data about the average number of courses taught and the number of student credit hours
generated per full-time faculty member by discipline. It may even be possible to estimate this
workload ratio by rank and tenure status, though the cell sizes begin to diminish if broad clusters of
disciplines are used.

The NSOPF collects student credit hours awarded and enrollment for each course taught. 
Once these are equated to student FTE, more complex SCH or FTE ratios per faculty FTE may 
be calculated. Estimates of student FTE workload by discipline become a useful tool for esti- 
mating potential faculty demand. 

One problem with this approach, besides the inability to properly weight the NSOPF data 
by discipline to the total faculty population, is the need to also account for faculty workload by 
part-time faculty and by non-instructional staff. If student SCH are used as the numerator and
full-time faculty FTE are used as the denominator, the average faculty workload will be
overestimated. Therefore, complex models of faculty demand based on workload must account for all
sources of teaching FTE. This justifies some of the confusion which exists in comparing the 
definition of faculty used for the NSOPF with that of the IPEDS S and SA. 
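
The size of this overestimate can be illustrated with a short Python sketch of the SCH per faculty FTE ratio. The department figures are invented and the function name is hypothetical.

```python
def sch_per_fte(student_credit_hours, faculty_fte):
    """Student credit hours generated per unit of faculty FTE."""
    if faculty_fte <= 0:
        raise ValueError("faculty_fte must be positive")
    return student_credit_hours / faculty_fte


# Hypothetical department figures.
total_sch = 9_000
full_time_fte = 20.0
part_time_fte = 5.0

# Ratio when only full-time faculty are counted as teaching FTE.
ratio_full_time_only = sch_per_fte(total_sch, full_time_fte)
# Ratio when all sources of teaching FTE are counted.
ratio_all_teaching = sch_per_fte(total_sch, full_time_fte + part_time_fte)
# Relative overstatement caused by omitting part-time FTE.
overstatement = ratio_full_time_only / ratio_all_teaching - 1.0
```

With these figures, leaving out the part-time FTE overstates the workload ratio by one quarter. The overstatement grows with the share of teaching carried by part-time and non-instructional staff.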

Even when data on part-time faculty by discipline are available, as they are in the 
NSOPF, the calculation of part-time faculty FTE and workload is problematic. Hopefully, ratios 
of student SCH to full-time faculty, while slightly over-inflated, are consistent over time, making 
them more useful in predicting faculty demand. Also, it is clear from studies of the growing
reliance on non-tenure track faculty that many of these new positions, created to meet growing
enrollment needs, are being filled with part-time, visiting, or restricted faculty. While there is
growth in enrollment, its impact may be to increase the need for part-time faculty, not to generate 
need for those SED respondents seeking academic work. 
Rank



In estimating the faculty population, the SDR and NSOPF sample surveys and the IPEDS 
S and SA population surveys collect data about the critical variables of rank and tenure status. It 
is also possible to make assumptions about the tier structures of Carnegie classification and con- 
trol using the IPEDS surveys. 

When the data are examined closely, some important components of faculty supply and 
demand variables are missing. For example, while the IPEDS S collects the number of new full- 
time faculty hires by tenure status, data are not collected on new hires by rank. This makes it 
impossible to document patterns in the hiring of new assistant professors. 

If the IPEDS S were collected every year, changes in the number of faculty by tenure 
status could be analyzed in relationship to the new hire data and assumptions could be made 
about new hires by rank. With biennial data, it is difficult to build this type of model. The S also 
documents the part-time faculty population. Data on new hires among part-time faculty are sus- 
pect, though, since returning part-time faculty are sometimes considered new because of their 
contract length. 

The IPEDS SA is particularly useful for mapping the growing reliance on non-tenure 
track faculty. The AAUP Faculty Salary Survey, which is identical in many respects to the 
IPEDS SA, also includes a section on continuing faculty. From this section, it is possible to
estimate the number of new faculty by rank and tenure status, though the report is not intended to
be used in this manner and some of the results may be suspect. Since many schools simply
submit their IPEDS SA survey to AAUP, patterns in the AAUP data on continuing faculty may 
be unsupported in the total population. 
The NSOPF Institutional Survey includes survey items about the number of instructional
and non-instructional faculty hired in Fall 1992, but these also are not broken out by rank. The 
annual CUPA survey includes new assistant professors as a special category of faculty for data 
collection and these data could be used to estimate hiring patterns among certain types of institu- 
tions. Still, this critical piece of faculty supply and demand models is missing from the data. 

The NSOPF Faculty Survey includes a question which documents faculty rank and the 
year in which it was obtained. Using data on year of degree, estimates may be made about the 
length of time necessary to earn rank promotions. Obviously, assumptions may be made about 
standard practices for awarding the associate professor rank upon granting tenure in the seventh 
year. This practice will vary, of course, by discipline and tier of institution. 
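
A minimal Python sketch of the time-to-rank estimate follows. It assumes each faculty record carries the year of degree and the year the current rank was obtained. The records and field names are hypothetical and are not actual NSOPF variable names.

```python
from statistics import median

# Hypothetical faculty records; field names are illustrative only.
records = [
    {"rank": "associate", "degree_year": 1980, "rank_year": 1987},
    {"rank": "associate", "degree_year": 1982, "rank_year": 1988},
    {"rank": "associate", "degree_year": 1975, "rank_year": 1984},
    {"rank": "full", "degree_year": 1970, "rank_year": 1984},
    {"rank": "full", "degree_year": 1972, "rank_year": 1984},
]


def median_years_to_rank(rows, rank):
    """Median years from degree to attaining the given current rank."""
    spans = [r["rank_year"] - r["degree_year"]
             for r in rows
             if r["rank"] == rank and r["rank_year"] >= r["degree_year"]]
    return median(spans) if spans else None


associate_years = median_years_to_rank(records, "associate")
full_years = median_years_to_rank(records, "full")
```

Because only the current rank and its year are recorded, the estimate for full professors covers the whole span from degree to full rank and cannot separate the time spent at each earlier rank.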

The SDR and NSOPF sample surveys may also be used to estimate the faculty population 
by rank. However, since these surveys are inadequate for estimating the population by disci- 
pline, the value of the rank data is greatly diminished and it makes better sense to use the IPEDS 
S and SA. 

Central to the use of SED data for faculty availability and supply and demand models are 
assumptions about rank transitions. It is unfortunate that the NSOPF does not document the year 
of each rank change, only that of the current rank. It is possible, though, to build estimates of 
rank transitions and length of time in rank by discipline. The SDR only collects current rank 
and does not include year of rank, making it even less useful for predicting these transitions. 

Massy and Goldman recognize that NSOPF data are insufficient for modeling rank
transitions. For their projections, they collected rosters on 3,970 faculty by field at ten institutions
for periods of time ranging from 1968 to 1992. From these data, they estimate rank transitions, 
quit rates, and retirements. For each reporting period, faculty members could have remained at 
their current rank, received a promotion, or left. The results are calculated as percentages for
each combination of Carnegie and control. This author had hoped that the SDR data could be 
used in this manner, but the cell sizes of the number of faculty who appear more than once in the 
SDR sample over time are very small. If more detailed data were collected in the SDR or the 
NSOPF about faculty histories by rank and tenure status, this type of research would be much 
more fruitful. 
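
The roster-based calculation described above amounts to tabulating, for each starting rank, the share of faculty who stayed, were promoted, or left during a reporting period. The Python sketch below illustrates that tabulation with invented observations. It is not a reproduction of the Massy and Goldman procedure, and the outcome labels are assumptions.

```python
from collections import Counter

# Hypothetical roster observations: (rank at start of period, outcome).
observations = [
    ("assistant", "stayed"), ("assistant", "promoted"),
    ("assistant", "left"), ("assistant", "stayed"),
    ("associate", "stayed"), ("associate", "promoted"),
    ("full", "stayed"), ("full", "left"),
]


def transition_rates(obs):
    """Share of each starting rank that stayed, was promoted, or left."""
    totals = Counter(rank for rank, _ in obs)
    pairs = Counter(obs)
    return {(rank, outcome): count / totals[rank]
            for (rank, outcome), count in pairs.items()}


rates = transition_rates(observations)
```

In practice the tabulation would be repeated for each combination of Carnegie classification and control, which is where the small number of repeat respondents in the SDR becomes the limiting factor.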

Tenure 

As explained above, the tenure data collected in the NSOPF allow for some estimates of 
tenure change and length of time in tenure status by discipline. These calculations could be ap- 
plied along with rank transition data to the faculty population, with projections of future promo- 
tions and their impact on openings for new hires. The data on tenure and rank could also be used 
in conjunction with faculty workload to better estimate faculty supply and demand. For exam- 
ple, it may be assumed that senior faculty will generate fewer SCH. If enrollment growth is to be 
accommodated given a greying professoriate, there needs to be increased hiring of new assistant 
professors. 

Fortunately, the IPEDS S collects data on new hires by tenure status. These data do not 
seem to appear in the literature, except for NCES publications. At least they do not appear in 
models of faculty supply and demand. They are particularly telling and should not be over- 
looked. 

As in the rank data, estimates of tenure status by discipline may be made using the 
NSOPF data. Once expanded to estimates by Carnegie classification and control, the cell sizes 
may be too small, but the data are available at the discipline level. 
Studies of the IPEDS S and SA data in the same survey year may yield interesting results
in the patterns of growth in research and public service faculty, something also little documented 
in the literature. These data may be particularly telling for non-tenure track research faculty. 

This author suspects that, except in land grant institutions, the use of public service faculty is 
minimal and probably on the decline, given decreases in state appropriations and support for ag- 
ricultural and extension programs. 

Surprisingly, Massy and Goldman (1995) do not address issues of tenure track status in 
their supply and demand models. Bowen and Sosa (1989) use tenure as a factor when calculat- 
ing quit rates, assuming in their standard model that 0.5% of tenured faculty will leave higher 
education each year. 

Quit Rates 

Massy and Goldman (1995) collect their own data on rank transitions and exits using fac- 
ulty rosters. Bowen and Sosa's (1989) model of faculty mobility is based on data from the SDR 
and previous research by Radner and Kuh (1978). This model includes a combination of as- 
sumptions about faculty quit rates, age cohorts, tenure status, and discipline. 

The SDR data, as well as the NSOPF data, may be used to estimate the faculty population 
by age. While the surveys were not stratified by discipline, it is reasonable to use both for age 
data by field, in part for reasons which will be noted below in the discussion of discipline data. 
Bowen and Sosa build cohorts of age groups based on age and tenure status by clusters of disci- 
plines. A high quit rate of 1% for tenured and 10% for non-tenured faculty is applied, along with 
standard quit rates of 0.5% and 5% respectively. Surprisingly, there is no differentiation in this 
model between tenure track and restricted faculty. Although there is increased reliance on re- 
stricted faculty and decreased hiring of tenure track faculty, it is to be expected that restricted 
faculty are much more likely to leave higher education, largely because of mobility issues and
decreased job stability. Another problem is that lateral movement at the same rank and tenure 
status between institutions is not accounted for by these models. 
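
The standard and high quit-rate assumptions quoted above can be applied to a faculty headcount as in the following Python sketch. The rates are those stated in the text for Bowen and Sosa's model. The headcounts are invented, and the sketch ignores the age cohorts and disciplinary clusters used in the full model.

```python
# Annual quit rates as quoted in the text for Bowen and Sosa (1989).
QUIT_RATES = {
    "standard": {"tenured": 0.005, "non_tenured": 0.05},
    "high": {"tenured": 0.01, "non_tenured": 0.10},
}


def expected_quits(headcount, scenario):
    """Expected annual exits from higher education under a scenario."""
    rates = QUIT_RATES[scenario]
    return sum(headcount[status] * rate for status, rate in rates.items())


# Hypothetical faculty headcount by tenure status.
faculty = {"tenured": 6_000, "non_tenured": 4_000}
standard_quits = expected_quits(faculty, "standard")
high_quits = expected_quits(faculty, "high")
```

A model that distinguished tenure track from restricted faculty would need a third rate for the restricted group, which these assumptions do not provide.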

Data from the 1987 Humanities Profile based on the SDR and analyzed by the NRC 
(1989) are used by Bowen and Sosa for validation of their results. The authors calculate an
average quit rate of 1.8% for all faculty, comparable to the NRC's calculation of 1.85%. However,
both sets of assumptions are for cohorts of age groups and need to be examined closely. To use
the SDR, survey items about age, occupation code, previous job code, and current work
responsibilities must all be taken into account. While previous researchers have exercised due caution
in using these variables, the methodological reports for the SDR and item response rates suggest
that the validity of some of the data needs to be questioned. For example, the imputation of
missing data using hot and cold deck procedures may be questioned for the variables which are 
selected. The sample selection for the SDR is drawn based on the field of doctorate, using SED 
discipline codes, and does not attempt to stratify by discipline or build a base of faculty data. 

Retirement 

A significant body of research has been conducted about faculty retirement projections 
and this will not be reviewed here. Massy and Goldman (1995) assume that their quit rate model 
based on faculty rosters will include some retirements. They include multiplier retirement rates 
of 0.025 and 0.030 in the simulation results, without any variation by field. Bowen and Sosa 
(1989) rely on AAU data on public universities collected by Lozier and Dooris (1987) which 
provide a distribution of retirements by age groups. These data suggest, for example, that the 
average retirement age is 65.1 years. The authors apply these rates by age group to the SDR 
data. 
Once a set of assumptions is made about retirement rates by age, the data on rank and
tenure status in the SDR and NSOPF data become more useful for predicting exits. In addition, 
the NSOPF collects data on expected retirement age. Using these data by discipline, projections 
may be made about exits. It would be interesting to compare the average retirement age by dis- 
cipline obtained with the NSOPF, which does not include retired faculty, with that of the SDR, 
which does (until age 76). Partly in response to this question, an analysis was done of the SDR 
microdata on year of retirement. These data show that, for whatever reason, few respondents are 
retired (or report retirement year). 

Mortality/Disability

Bowen and Sosa use gender-specific mortality rates supplied by TIAA-CREF for each 
five-year age cohort. These are used alongside data on quit rates, rank promotions, and retire- 
ment to predict overall survival rates. The results of these survival ratios range from 83.1% for 
faculty age 30 to 34 to 40.1% for those age 60 to 64. The authors hope that these data will fa- 
cilitate research on the impact of changes in pension plans and retirement laws. 
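
One simple way to combine exit rates into a survival ratio is to multiply the yearly probabilities of not leaving for each reason and compound the result over the period. The Python sketch below illustrates that idea only. It assumes the exit reasons are independent and the rates constant within a cohort, it is not Bowen and Sosa's actual calculation, and the rates shown are invented.

```python
def survival_ratio(annual_exit_rates, years):
    """Share of a cohort remaining after compounding annual exit rates."""
    yearly = 1.0
    for rate in annual_exit_rates.values():
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each rate must lie between 0 and 1")
        yearly *= 1.0 - rate
    return yearly ** years


# Hypothetical annual rates for a younger and an older cohort.
young = survival_ratio(
    {"quit": 0.03, "retirement": 0.0, "mortality": 0.001}, years=5)
older = survival_ratio(
    {"quit": 0.005, "retirement": 0.12, "mortality": 0.01}, years=5)
```

Under any such scheme the older cohort survives at a lower rate because retirement dominates the other exits, which is the pattern reported in the text.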

The SDR also collects data on deceased faculty, since the follow-up procedures used to ensure
high response rates record whether death is the reason for non-response. It may be
possible somehow to validate mortality rate estimates, since the TIAA-CREF mortality tables are 
not specific to faculty. 

Departmental Behavior 

The research of Massy and Goldman on departmental choice is perhaps the most inter- 
esting aspect of their supply and demand model. Based on interviews with 344 faculty at 19 in- 
stitutions, they found that "the natural production rate of doctorates is driven by departmental 
needs for research and teaching assistants, and that departmental doctoral-student intake is
limited by financial constraints rather than output-market considerations" (p. 1-4). "The labor
market was not referenced as a formal criterion to determine the number of students to admit, but
many faculty believe it influences the application pool and the types of jobs that graduates ob- 
tained" (p. 1-5). 

Bowen and Rudenstine (1992) discuss a number of factors related to department and
program evaluation, including issues of quality and scale; the evolution of top tier programs;
requirements and program content; and program design, oversight, and culture.

While it is important to consider these factors as relevant to faculty supply and demand, it 
is very difficult to quantify them as part of a complex model. Perhaps the most important insight 
is that drawn by Massy and Goldman that it is departmental needs, not information about the
labor market, which determine the number of doctoral students who are admitted. While the
authors build complex regression equations based on variables which are associated with de- 
partmental behavior, the results are less informative than the descriptive summaries of the inter- 
views. 

Tier/sector structures 

Many reports from NCES, NSF, and NRC detail faculty data by Carnegie and control. 
Depending on the degree of breakout, numerous combinations of cells may be produced for 
stratification. Decisions about reporting results in this manner become particularly important 
since the cell sizes of the NSOPF and SDR samples are so small in the cross-tabs of interest. 

Data from the IPEDS surveys lend themselves to reporting by Carnegie and control, be- 
cause most of the universe of institutions is included. Using the total population of faculty pro- 
vided by the IPEDS S or SA, it is possible to weight disciplinary data from the SDR or the 
NSOPF to the population. Many reports about SDR data use the population estimates by
occupation code or Carnegie/control which are weighted to census data. However, in trying to use the
SDR weighting scheme to estimate the population by discipline, NSF staff observed that it is
very difficult to understand the stratification which was used and therefore the appropriate table 
of standard errors and weightings. 

In using the microdata for the SDR, a data administration error in the 1993 file was found
which suggests that reported information about Carnegie and control is seriously in question. 
Cross-tabulations of Carnegie classification by other variables which indicate the type of educa- 
tional institution resulted in Research I universities being coded as Two Year College employer 
types. This problem was not detected in the methodological report. It is possible to work around 
it by relying on institutional identifiers such as FICE codes. However, FICE codes are assigned 
based on the name of the institution reported by each respondent. If the name is not completed
or not found on the current list, it is left blank. Institutional type variables are then imputed, in 
part based on the same erroneous variables. 
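
A consistency check of the kind that exposed this error can be sketched in a few lines of Python. The records, field names, and coded values below are hypothetical and do not reproduce the SDR file layout. The list of incompatible pairs is an assumption for the example.

```python
# Assumed incompatible (Carnegie class, employer type) combinations.
INCOMPATIBLE = {("Research I", "two_year_college")}

# Hypothetical records; field names and codes are illustrative only.
records = [
    {"id": "a", "carnegie": "Research I", "employer_type": "four_year_university"},
    {"id": "b", "carnegie": "Research I", "employer_type": "two_year_college"},
    {"id": "c", "carnegie": "Two-Year", "employer_type": "two_year_college"},
]


def inconsistent_ids(rows):
    """Return ids whose Carnegie class contradicts the employer type."""
    return [r["id"] for r in rows
            if (r["carnegie"], r["employer_type"]) in INCOMPATIBLE]


flagged = inconsistent_ids(records)
```

Flagged records would still need to be resolved against an institutional identifier, which is not possible where the identifier is blank.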

Massy and Goldman invest a significant amount of their research in constructing new 
types of tiers of institutions. Using data on faculty, degrees, finance, and post-docs, they conduct 
factor analyses and use the loading results to place institutions into categories for each of the 12 
disciplines examined. "As might be expected, the more prestigious schools tend to come out at 
the top" (p. 5-11). These segments of schools suggest meaningful relationships between the 
variables of interest. 

There is another reason for paying attention to issues of faculty supply and demand by 
Carnegie and control or other types of segments. It must be assumed, following the analysis of 
graduate programs from Bowen and Rudenstine (1992), that not all graduate programs are equal. 
This suggests that a tier structure is in effect with regard to hiring new faculty and in faculty
mobility. The elite segment of institutions is not open to potential faculty from what are considered
to be second class institutions. 

To demonstrate this phenomenon, it would be interesting to use the SDR to produce a
cross-tab with the Carnegie classification of the doctorate against the Carnegie classification of 
the faculty position. If the SDR Carnegie data cannot be cleaned up, the NSOPF data may be 
used for the same analysis. 
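
The proposed cross-tab can be sketched in Python as a count of faculty by the Carnegie classification of the doctoral institution against that of the employing institution. The records below are invented and the category labels are illustrative.

```python
from collections import Counter

# Hypothetical pairs: (Carnegie class of doctorate, Carnegie class of employer).
pairs = [
    ("Research I", "Research I"),
    ("Research I", "Research I"),
    ("Research I", "Comprehensive"),
    ("Doctoral", "Comprehensive"),
    ("Doctoral", "Two-Year"),
]

# Cell counts of the cross-tab.
crosstab = Counter(pairs)


def destination_shares(table, origin):
    """Row percentages: where doctorates from one tier are employed."""
    row = {dest: n for (src, dest), n in table.items() if src == origin}
    total = sum(row.values())
    return {dest: n / total for dest, n in row.items()}


research_i_shares = destination_shares(crosstab, "Research I")
```

A tier structure would appear as large diagonal and downward cells with few or no faculty moving from lower-tier doctoral institutions into elite employers.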

Further assumptions may be made, based on this type of analysis, about movement be- 
tween institutions. If Harvard, Stanford, and MIT only hire faculty from within a select segment 
of institutions, then a separate faculty supply and demand model needs to be constructed and 
models that are not sensitive to this tier phenomenon will be inadequate. Similarly, four-year,
public comprehensive institutions cannot hope to attract doctoral recipients from elite
institutions. This calls for further investigation of faculty pipeline issues surrounding graduate
education, such as mentoring, research sponsorship, and maintaining cadres of doctoral students to
work with senior researchers both during and after the Ph.D. It is difficult to quantify these
issues. It seems doubtful that simply including more of these related variables in a regression
equation, as done by Massy and Goldman, will have any effect on the endogenous
variables.

Research activity 

Research activity impacts the financial support of graduate students, the support of
post-docs, and the need for faculty. This is a basic measure of departmental behavior used by Massy
and Goldman (1995) to predict faculty supply. Data on research and development expenditures
and research equipment expenditures by discipline and institution, calculated from
CASPAR, are used in their factor analysis for institutional segmentation.
It is assumed in the authors' regression model of departmental behavior that research
monies are tied to doctoral enrollment. However, a somewhat surprising result of the simulation 
is that "without changes in academic production norms, increases in sponsored research tend to 
hurt long-term doctorate employment at rates that can easily exceed half the favorable short-term 
effect" (Massy and Goldman, 1995, p. 1-34). 

Table 4: Non-Faculty, Doctoral Research Staff

                                                    Non-Faculty
Graduate Student Survey Academic Discipline      Research Staff

+ SCIENCES AND ENGINEERING (EXCL HEALTH FIELDS)           7,707
+ SCIENCES (EXCLUDING HEALTH FIELDS)                      6,739
+ ENGINEERING                                               968
+ PHYSICAL SCIENCES                                       1,635
+ EARTH, ATMOSPHERIC, AND OCEAN SCIENCES                    510
+ MATHEMATICAL AND COMPUTER SCIENCES                        148
AGRICULTURAL SCIENCES                                       281
+ BIOLOGICAL SCIENCES                                     3,604
+ PSYCHOLOGY                                                365
+ SOCIAL SCIENCES                                           196

This suggests that the relationship between patterns of sponsored research by discipline
and faculty supply and demand is not simple. Increased reliance on post-docs and on
non-tenure track faculty may decrease demand for the tenure track appointments which many
doctoral recipients believe they can obtain.

One additional "holding pattern" or alternative career path which needs to be explored is 
the use of non-faculty research staff with doctorates. These data are collected on the GSS, but 
are rarely reported on. These staff are defined by the survey as "all doctoral scientists and engi- 
neers who are involved principally in research activities but are not considered either postdoc- 
toral appointees or members of the regular faculty" (NSF, 1995, p. 53). It may be assumed that 
they could compete with recent doctorates and post-docs for faculty jobs. Table 4 documents
these research staff by discipline. 

Disciplines 

Much of the discussion about assumptions centers around the availability of data by dis- 
cipline. Statements have been made about problems in the SDR and NSOPF sampling method- 
ologies and about the ability to build crosswalks in taxonomies between different datasets. 

Many policy issues may be investigated by examining faculty characteristics
individually. Data on faculty age, for example, document the continued greying of the professoriate. If
this analysis is extended to the question of whether growth in tenured faculty is preventing new 
appointments, a second characteristic may be added. For these two and three-way cross- 
tabulations, the cell sizes of the SDR and NSOPF samples are adequate. If these are extended, 
however, to make assumptions about institutional segments or tiers, the cell sizes are inadequate. 
Sometimes, researchers fail to acknowledge this and prepare analyses, weighting them with ap- 
proved weights to the population. While the standard errors may fall within acceptable ranges, 
the weighting methodology becomes suspect. In the SDR 1993, it is impossible to construct new 
cross-tabs of interest and weight them to the population, because the stratification, sampling, and 
weighting processes were so complex and even now not fully understood by some NSF staff. 
This is true for many different cross-tabs of characteristics, but is particularly important to rec- 
ognize in issues surrounding discipline. 

The SDR does not collect detailed data on faculty members' discipline. Many reports use
the discipline of the Ph.D., in part because the only departmental affiliation which is recorded in
the survey is occupation code. For post-secondary faculty, there are only 29 occupation codes
for scientists and engineers and a comparable number for those in the humanities. The
methodology report for the 1993 SDR suggests that this variable is much misunderstood for many
reasons. In examining the administration and "cleanliness" of the microdata, it is apparent that the
data files are not intended to be used in this way. For example, some post-secondary occupation
codes are held by persons who do not work at higher education institutions. Some Research I
institutions are listed as Two-Year College employer types.

The SDR occupation code data are weighted to U.S. Census data and its projected annual 
increases. The author would feel more confident if the post-secondary data were weighted to the 
population of faculty provided with the IPEDS S or SA. NSF staff report that estimates of the 
population are roughly comparable to those calculated with the NSOPF. 

Problems in the NSOPF study have been discussed, most notably that the measure of size (MOS) of
41.5 for sampling with certainty among research and doctoral institutions is questionable. While
standard errors for cross-tabs of interest may fall within acceptable levels, methodological
concerns are raised when this MOS is used to predict the faculty population by discipline. This is
particularly vexing because the NSOPF could have been stratified by discipline and was not, even
though discipline was available in the faculty rosters from which the sample was drawn.
Numerous conversations have been held with NCES, AAUP, NSF, and other agency staff about this
issue. Out of this, the author has put forward the statement that while the methodology provides
a valid estimate of the faculty population, it is not a good estimate.

The most difficult part of this discussion is that there is no single source of data for the
faculty population by discipline to which the NSOPF results could be compared. The SDR
post-secondary occupation codes are inadequate. Furthermore, no crosswalk has been built between
the WebCASPAR taxonomy and the occupation codes. This is not because NSF and QRC have
not had time or funding to do so, but because the premise that occupation codes are useful in this
manner is flawed. NCES staff have already undergone severe methodological criticism for the
under-sampling of health science faculty. When the standard errors fall within acceptable
ranges, it is difficult to raise another design issue. For this reason, the author constructed a
simulation to determine whether the MOS of 41.5 by institution is a good predictor.

Table 5: Comparison of Random Sample vs. Population of NRC Data by Discipline

Discipline  N Random   Discipline  Discipline    % Change      N of        N of    N Over-    % Over-
              Sample  % in Sample    % in Pop  Sample vs.  Estimate  Population   estimate   estimate
                                                     Pop.

12               520         4.9%        4.6%       -0.3%     4,319       4,082        237       5.8%
13               159         1.5%        1.6%        0.1%     1,321       1,420        -99      -7.0%
14               111         1.0%        1.4%        0.4%       922       1,235       -313     -25.3%
15               357         3.4%        1.0%       -2.3%     2,965         925      2,040     220.6%
16                52         0.5%        0.7%        0.2%       432         621       -189     -30.5%
17                33         0.3%        0.8%        0.5%       274         700       -426     -60.8%
18                24         0.2%        0.6%        0.4%       199         547       -348     -63.6%
19                49         0.5%        1.0%        0.5%       407         889       -482     -54.2%
20                31         0.3%        0.7%        0.4%       257         578       -321     -55.5%
21                17         0.2%        0.4%        0.2%       141         357       -216     -60.4%
22                33         0.3%        0.4%        0.1%       274         381       -107     -28.1%
23               620         5.8%        4.4%       -1.4%     5,150       3,881      1,269      32.7%
24               541         5.1%        4.9%       -0.2%     4,493       4,285        208       4.9%
25               432         4.1%        5.0%        1.0%     3,588       4,436       -848     -19.1%
26               235         2.2%        2.6%        0.4%     1,952       2,284       -332     -14.5%
27               222         2.1%        2.3%        0.2%     1,844       2,028       -184      -9.1%
28                73         0.7%        1.1%        0.4%       606         983       -377     -38.3%
29                40         0.4%        0.6%        0.2%       332         522       -190     -36.4%
30               479         4.5%        3.9%       -0.6%     3,978       3,443        535      15.6%
31               306         2.9%        2.9%        0.0%     2,542       2,543         -1      -0.1%
32               143         1.3%        1.5%        0.1%     1,188       1,284        -96      -7.5%
33               153         1.4%        2.1%        0.7%     1,271       1,845       -574     -31.1%
34               122         1.1%        1.3%        0.2%     1,013       1,188       -175     -14.7%
35             1,075        10.1%        5.7%       -4.4%     8,929       5,043      3,886      77.1%
36               245         2.3%        3.1%        0.7%     2,035       2,694       -659     -24.5%
37               295         2.8%        3.7%        0.9%     2,450       3,278       -828     -25.3%
38               228         2.1%        2.7%        0.5%     1,894       2,365       -471     -19.9%
39               169         1.6%        2.3%        0.7%     1,404       2,039       -635     -31.2%
40                82         0.8%        1.5%        0.8%       681       1,363       -682     -50.0%
41                37         0.3%        0.6%        0.3%       307         551       -244     -44.2%
51               879         8.3%        7.0%       -1.3%     7,301       6,186      1,115      18.0%
52               779         7.3%        6.5%       -0.9%     6,470       5,718        752      13.2%
53               247         2.3%        3.3%        1.0%     2,052       2,899       -847     -29.2%
54               328         3.1%        3.3%        0.2%     2,724       2,934       -210      -7.1%
55               396         3.7%        3.0%       -0.7%     3,289       2,657        632      23.8%
56               486         4.6%        4.8%        0.2%     4,037       4,201       -164      -3.9%
57               384         3.6%        3.4%       -0.3%     3,189       2,968        221       7.5%
61                57         0.5%        0.7%        0.2%       473         618       -145     -23.4%
62                65         0.6%        1.1%        0.5%       540         959       -419     -43.7%
63                48         0.5%        0.6%        0.2%       399         570       -171     -30.1%
65                68         0.6%        0.8%        0.2%       565         708       -143     -20.2%
Total         10,620       100.0%      100.0%        0.0%    88,208      88,208          0       0.0%

One source of faculty data by discipline is the NRC study of doctoral program rankings 
conducted in 1982 and in 1993. Surprisingly, Massy and Goldman (1995) use the 1980 NRC 
data in their prediction of faculty supply and demand. They recognized it as the only source of 
population data, albeit of a small segment of the faculty population, that of research and doctoral 
institutions and departments with research doctorate programs. 

The current NRC doctoral ranking data are available on CD-ROM and include a roster of
all 88,208 faculty, including data on their institution and discipline. A SAS program was written
to select a random sample from the 88,208 faculty, drawing 41.5 faculty at random from each
institution. As in the NSOPF methodology, for those institutions with fewer than 41.5 faculty, all
faculty were selected. Table 5 depicts the results of this analysis. The percentage of faculty in each
discipline is calculated from the sample of 10,620 faculty, then weighted to the population. In
only a few disciplines was the prediction of faculty by discipline close to the actual data.
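The simulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's SAS program, which is not reproduced in the report. The function names, the toy roster, and the way the fractional 41.5 is turned into a whole number of draws (41 or 42 with equal odds) are all hypothetical. The weighting step simply scales each discipline's share of the sample to the population total, which reproduces the "N of Estimate" column of Table 5 (for example, 520 / 10,620 x 88,208 is about 4,319).

```python
import random
from collections import Counter

MOS = 41.5  # per-institution measure of size quoted in the text


def draw_sample(roster, mos=MOS, seed=0):
    """Draw about `mos` faculty per institution from (institution, discipline) records."""
    rng = random.Random(seed)
    by_institution = {}
    for institution, discipline in roster:
        by_institution.setdefault(institution, []).append(discipline)

    sample = []
    for institution in sorted(by_institution):
        members = by_institution[institution]
        if len(members) < mos:
            sample.extend(members)  # small institution: take everyone
            continue
        # Assumption: the fractional 41.5 is realized as 41 or 42 with equal odds.
        k = int(mos) + (1 if rng.random() < mos - int(mos) else 0)
        sample.extend(rng.sample(members, k))
    return sample


def weighted_estimate(sample_n, sample_total, population_total):
    """Scale a discipline's share of the sample up to the population size."""
    return sample_n / sample_total * population_total


def compare(sample, roster):
    """Return {discipline: (estimate, population, over_estimate, pct_over)}."""
    population = Counter(discipline for _, discipline in roster)
    sampled = Counter(sample)
    rows = {}
    for discipline, pop_n in population.items():
        est = weighted_estimate(sampled[discipline], len(sample), len(roster))
        rows[discipline] = (est, pop_n, est - pop_n, 100 * (est - pop_n) / pop_n)
    return rows


# Toy roster (hypothetical): one small and one large institution.
toy_roster = [("Small College", "A")] * 10 + [("Large University", "B")] * 1000
toy_rows = compare(draw_sample(toy_roster), toy_roster)
```

In the toy roster, discipline "A" exists only at the small institution, where everyone is selected, so it is over-estimated, while discipline "B" at the large institution is under-estimated. This is one way a fixed per-institution draw could distort discipline totals; whether it explains the pattern in Table 5 is not established by the sketch.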

While it is possible to use the NRC doctoral rankings data, as Massy and Goldman have
done, to estimate the faculty population by discipline, discussions with NRC staff suggest that the
data were never intended to be used in this manner. They are useful, though, in suggesting that
while the NSOPF design is a valid estimate, it is not a good estimate.
IV. Constructing a Complex Model



A complex model of faculty supply and demand may be constructed based on the 19
types of assumptions which appear in the research. This model has the following components:

(1) Enrollment - with undergraduate, masters, and doctoral enrollment from the IPEDS
EF reports. These data may be broken out by gender and ethnicity.

(2) Degrees - with IPEDS C and SED data used to document masters and doctoral
degrees and recipient characteristics. Degree data are qualified with assumptions about
the percent of graduates who plan to enter academe by discipline. Doctoral data by
discipline are qualified with assumptions about financial support and time-to-degree.

(3) Post-docs - with data by discipline on the number and percentage of students with the
appointments. Assumptions about length of appointment need to be developed.

(4) Non-faculty research staff - this temporary holding pattern needs to be included
with data from the GSS. Assumptions need to be made about this type of position.

(5) Faculty population - documentation of the population of full-time, instructional,
research, and public service faculty, broken out by rank within tenure status by
discipline and by tier of institution. Assumptions need to be made about rank transitions,
retirement rates, quit rates, and mortality by discipline.

(6) Faculty workload - projections need to be calculated based on workload data for
full-time faculty. Assumptions need to be made about other teaching FTE.

(7) Research activity - need for post-docs, non-faculty research staff, and degree
productivity are linked to research funding by discipline.
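To make component (5) concrete, the following is a minimal sketch of how retirement, quit, and mortality assumptions might enter a one-year headcount projection for a single discipline. It is not the model proposed in this study. The class and function names are hypothetical, and the rates shown are placeholders chosen only for illustration, not estimates drawn from any dataset.

```python
from dataclasses import dataclass


@dataclass
class AttritionRates:
    """Hypothetical annual rates for one discipline (placeholders, not estimates)."""
    retirement_rate: float
    quit_rate: float
    mortality_rate: float

    def total(self):
        return self.retirement_rate + self.quit_rate + self.mortality_rate


def project_one_year(faculty, rates, new_hires):
    """Advance a faculty headcount by one year under the given attrition rates."""
    return faculty - faculty * rates.total() + new_hires


def hires_to_hold_steady(faculty, rates):
    """New hires needed to keep the headcount unchanged for one year."""
    return faculty * rates.total()


# Placeholder rates: 3% retire, 2% quit, 0.5% mortality.
example_rates = AttritionRates(retirement_rate=0.03, quit_rate=0.02, mortality_rate=0.005)
steady_hires = hires_to_hold_steady(1000, example_rates)
```

Under these placeholder rates, a discipline with 1,000 faculty would need about 55 hires to hold steady. The point of the sketch is that each input must be known by discipline before the projection means anything.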

Are the data for this model available in the national datasets? With each assumption, the 
data availability and data administration issues were described. There are only incomplete data 
on post-docs. No data are available on rank transitions, only year of current rank and tenure. No
discipline data are available to weight a sample to the population. 

Many critical components of the model can be completed, however. The NSOPF and 
SDR are very useful in constructing age cohorts for retirement and mortality assumptions. The 
SED and NSOPF are helpful in assumptions about time-to-degree. The SED is essential for 
documentation of employment plans and the NSOPF for faculty workload issues. The population
data of the IPEDS C and EF are critical to the enrollment component, and the CGS/GRE and
GSS provide additional breakouts by graduate program. No other data on non-faculty research 
staff exist besides what is collected with the GSS. 

This research suggests that while existing data collection efforts allow for many types of
complex policy studies about faculty, it is impossible to construct a complex model of faculty
supply and demand with currently available data. The studies of Massy and Goldman (1995) and
Bowen and Sosa (1989) are flawed because they do not adequately document the faculty population
by discipline.

Only simple descriptive statistics may be produced to test questions about Ph.D.
overproduction. This is somewhat of a disappointment, given the promise of existing research and the
efforts of this current study to investigate the microdata. The best approach to estimating
overproduction is to calculate the total number of potential job seekers (from the SED) as a ratio to
total full-time instructional faculty (from the IPEDS SA), over time. While doctoral
unemployment studies using the SDR are interesting, they are not valid when applied to the faculty
population. Table 6 presents the results of this analysis.

These ratios of the number of doctoral graduates seeking academic employment to the
number of full-time, instructional faculty suggest that job hunting was much easier in the early
1970s, but became increasingly difficult by the late 1980s. Current data are comparable to
the late 1970s.

Table 6: Trends in Academic Job Seekers/Total Full-Time Faculty

Year  # Job Hunters  # Faculty  # Fac per Graduate

71           12,989    320,844                  25
72           12,546    328,234                  26
73           12,076    341,998                  28
74           10,802         na                  na
75           10,571    369,281                  35
76           10,236    377,157                  37
77            9,299    386,880                  42
78            8,622    389,001                  45
79            8,478    395,968                  47
80            8,224    396,402                  48
81            7,952    400,772                  50
82            7,430    406,795                  55
83            7,221    407,799                  56
84            6,823         na                  na
85            6,795    395,912                  58
86            6,710    395,857                  59
87            6,661         na                  na
88            6,824    430,740                  63
89            7,087         na                  na
90            8,407    437,128                  52
91            9,594    450,356                  47
92           10,123    446,930                  44
93           10,203    454,104                  45
94           10,712    454,008                  42
95           11,293    457,913                  41
96           10,588    457,692
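The last column of Table 6 can be reproduced with a short calculation. The sketch below assumes that "# Fac per Graduate" is full-time faculty divided by job hunters, rounded to the nearest whole number. That is the inverse of the job-seekers-to-faculty ratio described in the text, but it matches the printed values. The function and variable names are hypothetical, and only a few rows are included.

```python
# Selected rows from Table 6: (year, job hunters, full-time faculty).
TABLE_6_ROWS = [
    (71, 12989, 320844),
    (80, 8224, 396402),
    (88, 6824, 430740),
    (95, 11293, 457913),
]


def faculty_per_job_hunter(job_hunters, faculty):
    """Full-time faculty per academic job seeker, rounded to a whole number."""
    if faculty is None:
        return None  # years reported as "na" in the table
    return round(faculty / job_hunters)


ratios = {year: faculty_per_job_hunter(jh, fac) for year, jh, fac in TABLE_6_ROWS}
```

Years with no faculty count (74, 84, 87, 89) would be passed as `None` and carried through as missing rather than interpolated.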
V. Conclusions and Recommendations

Good data on the faculty population by discipline are badly needed if scholars and policy 
makers are to verify critical projections about the overproduction of Ph.D.s. Discipline-specific 
data inform many other types of studies. It should not be assumed that early retirement programs 
will affect all disciplines in the same way or that changing structures of tenure are uniform across 
fields. Research about rank transitions, the growing reliance on non-tenure track faculty, faculty
workload, and faculty salaries must be differentiated by discipline to address policy issues at the 
appropriate unit of concern. 

Several attempts have been made by agencies to collect data by discipline on a larger
proportion of the faculty population, including an early version of the NSF-NIH Graduate
Student Survey. This author has recommended that the NSOPF survey be stratified by discipline.
He has also proposed at an NPEC-sponsored meeting of the IPEDS Technical Review Panel that
the IPEDS S be modified to collect information at the discipline level. Such a report would be
similar to the section on gender and ethnicity by rank and tenure. The column headers would be
rank within tenure and the rows would leave room for each discipline offered at an institution,
using two-digit CIP codes.
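The proposed layout can be illustrated with a small tabulation of unit records. This is a sketch only: the column categories, the record format, and the sample records are hypothetical, and the actual IPEDS S form would define its own rank and tenure categories. The CIP codes shown are used purely as examples of collapsing a detailed code to its two-digit series.

```python
from collections import Counter

# Hypothetical column layout: rank within tenure status.
COLUMNS = [
    ("Tenured", "Professor"),
    ("Tenured", "Associate Professor"),
    ("Tenure track", "Assistant Professor"),
    ("Non-tenure track", "Instructor"),
]


def tabulate(unit_records):
    """Count faculty by two-digit CIP code (rows) and rank within tenure (columns)."""
    counts = Counter((cip[:2], tenure, rank) for cip, tenure, rank in unit_records)
    rows = {}
    for cip2 in sorted({key[0] for key in counts}):
        rows[cip2] = [counts[(cip2, tenure, rank)] for tenure, rank in COLUMNS]
    return rows


# Hypothetical unit records: (CIP code, tenure status, rank).
records = [
    ("26.0101", "Tenured", "Professor"),
    ("26.0202", "Tenured", "Professor"),
    ("26.0101", "Tenure track", "Assistant Professor"),
    ("27.0101", "Tenured", "Associate Professor"),
]
table = tabulate(records)
```

Each row of `table` is one discipline at one institution, and each cell is a headcount, which is the only measure the recommendation asks institutions to report.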

While there are many types of data to collect by discipline, the critical cross-tab of
interest for faculty supply and demand is the count of full-time, instructional, research,
and public service faculty by rank within tenure status at each institution. Any data collection
effort besides headcount, such as faculty salary outlay, gender, ethnicity, or FTE, would require
that this report be much more complex and unwieldy. While even this single page presents an
additional reporting burden for institutions, unit record data are already being collected to
produce other IPEDS S and SA reports. As for which taxonomy of disciplines to use, IPEDS
respondents already use CIP codes; these are also used by CUPA and Oklahoma, and a crosswalk
already exists to CASPAR.

This proposed table of rank by tenure for each two-digit CIP code would provide a
significant boost to researchers' ability to conduct policy studies on faculty. It would be the first
time that data on the faculty population by discipline are collected. These data would
provide an invaluable baseline for weighting all sample-based studies.

Several other recommendations arise from this study. In addition to stratifying the
NSOPF, it would be very helpful to collect more information about faculty members' history of
rank promotions and tenure awards. As for the SDR, it would be very useful if better
occupational code data were collected and if the sample were stratified by discipline. For data on new
hires, the IPEDS S would be much more useful if the data on new full-time faculty hires by
tenure status were expanded to include rank.